METHOD OF GENERATING AN EXCEPTIONAL PRONUNCIATION
DICTIONARY FOR AUTOMATIC KOREAN PRONUNCIATION GENERATOR

TECHNICAL FIELD 

The present invention relates to a method of generating an exceptional
pronunciation dictionary for an automatic Korean pronunciation generator in a
Text-to-Speech (TTS) system or in an automatic speech recognition (ASR) system.



BACKGROUND OF INVENTION 

Conventionally, a method for automatic Korean pronunciation generation, as
shown in FIG. 1, comprises the steps of analyzing and pre-processing inputted text;
analyzing morphemes of the text; tagging parts of speech (POS); and generating
pronunciations based on an exceptional pronunciation dictionary and a set of
regular rules for changing phonemes. The automatic Korean pronunciation
generator is thus characterized by two parts: the dictionary of exceptional words
and the set of regular rules for changing phonemes. While research on the regular
rules for changing phonemes has progressed actively, the exceptional words have
been recorded in the dictionary of exceptional words in a simple and unsystematic
manner.

One example of a regular rule is the Fortition of lenis consonants; e.g., the
Korean word '극비 (kIkpi)' is pronounced as [극삐 (kIkbi)]. That is, by the Fortition
rule, the Korean letter 'ㅂ (p)' after 'ㄱ (k)' is pronounced as [ㅃ (b)]. The Fortition
rule likewise covers 'ㄷ (t), ㄱ (k), ㅅ (s), and ㅈ (c)', which after 'ㄱ (k)'
are respectively pronounced as [ㄸ (d), ㄲ (g), ㅆ (S), ㅉ (z)]. In general, when a
Korean obstruent letter 'ㅂ (p), ㄷ (t), ㄱ (k), ㅅ (s), or ㅈ (c)' of a Korean word is
positioned after another Korean obstruent letter, it is pronounced as
[ㅃ (b), ㄸ (d), ㄲ (g), ㅆ (S), ㅉ (z)], respectively. This Fortition rule
has no exceptions in the given environment.
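
The regular Fortition rule described above can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the generator's actual implementation: it works on the romanized notation used in this description (p, t, k, s, c for lenis; b, d, g, S, z for fortis), it treats only those five lenis letters as obstruents, and the name `apply_fortition` is hypothetical.

```python
# Lenis obstruents and their fortis counterparts in this description's romanization.
# Assumption: only these five letters are treated as obstruents in this sketch.
FORTIS = {"p": "b", "t": "d", "k": "g", "s": "S", "c": "z"}


def apply_fortition(word: str) -> str:
    """Turn a lenis obstruent into its fortis form when it directly follows an obstruent."""
    out = []
    prev = ""
    for ch in word:
        # Compare against the underlying previous letter, not the rewritten one.
        out.append(FORTIS[ch] if ch in FORTIS and prev in FORTIS else ch)
        prev = ch
    return "".join(out)


print(apply_fortition("kIkpi"))    # kIkbi
print(apply_fortition("pulkoki"))  # pulkoki (l is not an obstruent)
```

Because the rule fires only after an obstruent, a word such as "mulkoki" passes through unchanged, which is why its fortis pronunciation has to come from a dictionary rather than from this rule.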

By contrast, alternative pronunciations can be observed in certain
contexts, in which the choice of pronunciation depends on the individual word
(i.e., it is idiosyncratic). It is impossible to formulate rules for such words, which
should therefore be classified as entries of the Exceptional Pronunciation Dictionary
in TTS or ASR. For example, '물고기 [mulkoki]' and '불고기 [pulkoki]' are respectively
realized as [물꼬기] [mulgoki] and [불고기] [pulkoki]. In '불고기 [pulkoki]', the letter
'ㄱ [k]' located after the letter 'ㄹ [l]' is pronounced as [ㄱ] [k], while in
'물고기 [mulkoki]', the letter 'ㄱ [k]' located after the letter 'ㄹ [l]' is pronounced
as [ㄲ] [g]. The Fortition in [물꼬기] [mulgoki] is an exceptional case, which is not
predictable and needs to be recorded as an entry of the Exceptional Pronunciation
Dictionary.
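
The division of labor between the dictionary and the rules can be illustrated as follows. This is a hedged sketch, assuming the romanized notation of this description; the one-entry `EXCEPTIONS` table and the function names `regular_rules` and `pronounce` are hypothetical and stand in for the real dictionary and rule component.

```python
# Lenis-to-fortis mapping in this description's romanization.
FORTIS = {"p": "b", "t": "d", "k": "g", "s": "S", "c": "z"}

# Hypothetical one-entry exceptional pronunciation dictionary.
EXCEPTIONS = {"mulkoki": "mulgoki"}


def regular_rules(word: str) -> str:
    """Regular Fortition only: a lenis obstruent after an obstruent becomes fortis."""
    return "".join(
        FORTIS[ch] if ch in FORTIS and i > 0 and word[i - 1] in FORTIS else ch
        for i, ch in enumerate(word)
    )


def pronounce(word: str) -> str:
    """Consult the exception dictionary first, then fall back to the regular rules."""
    if word in EXCEPTIONS:
        return EXCEPTIONS[word]
    return regular_rules(word)


print(pronounce("mulkoki"))  # mulgoki (dictionary entry)
print(pronounce("pulkoki"))  # pulkoki (no rule applies, no entry)
print(pronounce("kIkpi"))    # kIkbi   (regular Fortition)
```

Checking the dictionary before the rules keeps the rule component free of word-specific special cases.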

Generating exceptional pronunciations in Korean has been known as a
challenging task in Korean TTS systems and Korean speech recognition systems,
but very little research has been conducted on the matter. To address it, the
characteristics of words having exceptional pronunciations need to be identified
in advance.



DISCLOSURE OF INVENTION

Therefore, it is an object of the present invention to provide a method for
generating an exceptional pronunciation dictionary for an automatic Korean
pronunciation generator by reviewing the words which have exceptional
pronunciations in a text corpus, based on the characteristics of words with
exceptional pronunciations identified through phonological research and text
analysis of the Korean language.

BRIEF DESCRIPTION OF DRAWINGS 

This invention will be better understood and its various objects and
advantages will be fully appreciated from the following descriptions taken in
conjunction with the accompanying drawings, in which:

FIG. 1 shows a block diagram of an automatic pronunciation generator;
FIG. 2 indicates a method for compiling an exceptional pronunciation
dictionary 1 using a general dictionary; and
FIG. 3 indicates a method for compiling a new exceptional pronunciation
dictionary 2 using a text corpus.

BEST MODE FOR CARRYING OUT THE INVENTION 

This invention comprises the steps of (1) setting exceptional pronunciation
conditions; (2) compiling an exceptional pronunciation dictionary using a general
dictionary; and (3) compiling the exceptional pronunciation dictionary using a text
corpus.

The step of setting exceptional pronunciation conditions establishes the
phoneme conditions in which exceptional pronunciations are observed, based on
systematic research in Korean phonology and on text analysis.
Although it has been thought that the phoneme conditions of exceptional
pronunciations cannot be explained with any rules, the present disclosure shows
that they are regular, based on thorough research: the words showing
exceptional pronunciations in Korean are observed only in certain limited conditions.






PPW05-074 



The step of generating the exceptional pronunciation dictionary includes the 
following two steps. 

The first step is to generate an exceptional pronunciation dictionary by
analyzing words having exceptional pronunciations in a general Korean
dictionary. By using a general Korean dictionary, duplication of vocabulary can be
minimized and a wide variety of vocabulary can be included in the
exceptional pronunciation dictionary. The general Korean dictionary analyzed
in this research is the YEONSEI KOREAN DICTIONARY (YKD
henceforth), which records about 50,000 high-frequency entry words. To
generate an exceptional pronunciation dictionary, an exceptional condition
reference dictionary, which includes the words that appear in the exceptional
pronunciation conditions, first needs to be established using the YKD. The exceptional
pronunciation dictionary is then generated by manual review of the words listed in
the exceptional condition reference dictionary.

However, words not included in the general dictionary are also used in
actual economic and social life. Furthermore, new words are continually being
coined in changing circumstances of life, such as the new words observed in the
texts of newspapers or broadcasts; these should also be extracted and listed in the
exceptional pronunciation dictionary, which is the purpose of the second step.


(1) Setting exceptional pronunciation conditions 

The exceptional pronunciation conditions are the phonological conditions in
which exceptional pronunciations are observed.
Accordingly, systematic phonological conditions were first investigated, based on
the characteristics of words with exceptional pronunciations identified
through text analysis.

The words which have exceptional pronunciations are nouns and their
derivatives, which are declinable parts of speech in Korean.
The following description discloses the phonological conditions in which
exceptional pronunciations are observed.

Generally, phonological conditions include four different cases: the first case is
when a consonant follows a vowel; the second, when a consonant follows a
preceding consonant; the third, when a vowel follows a vowel; and the fourth,
when a vowel follows a consonant.

Among the above four cases, the phonological conditions for the exceptional
pronunciations are the second case, when a consonant follows another preceding
consonant, and the fourth case, when a vowel follows a consonant. In the second
case, the preceding consonant is a voiced sound such as 'ㅁ [m], ㄴ [n], ㅇ [N],
ㄹ [l]', and the following consonant is a lenis sound. In this context, no regular
phoneme rule can be applied; instead, the lenis sound is pronounced as fortis in
some words and not in others. An example has already been shown above:
'물고기 [mulkoki]' and '불고기 [pulkoki]' are respectively realized as
[물꼬기] [mulgoki] and [불고기] [pulkoki]. In '불고기 [pulkoki]', the letter 'ㄱ [k]'
located after the letter 'ㄹ [l]' is pronounced as [ㄱ] [k], while in '물고기 [mulkoki]',
the letter 'ㄱ [k]' located after the letter 'ㄹ [l]' is pronounced as [ㄲ] [g]. Such
words, which have different pronunciations in the same phoneme context, are
exceptional pronunciation words and are eventually recorded in the
exceptional pronunciation dictionary.
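
The consonant-consonant condition described above can be tested mechanically on Hangul text. The sketch below is an illustration under stated assumptions rather than the disclosed procedure: it relies on the standard Unicode arithmetic for precomposed Hangul syllables, it checks only the "voiced final followed by lenis initial" condition, it ignores part of speech, and the names `decompose` and `in_exceptional_condition` are hypothetical.

```python
HANGUL_FIRST, HANGUL_LAST = 0xAC00, 0xD7A3  # precomposed Hangul syllable block

# Final-consonant indices for the voiced sounds n (4), l (8), m (16), ng (21).
VOICED_FINALS = {4, 8, 16, 21}
# Initial-consonant indices for the lenis sounds k (0), t (3), p (7), s (9), c (12).
LENIS_INITIALS = {0, 3, 7, 9, 12}


def decompose(syllable: str) -> tuple[int, int, int]:
    """Return (initial, vowel, final) indices of one precomposed Hangul syllable."""
    offset = ord(syllable) - HANGUL_FIRST
    return offset // 588, (offset % 588) // 28, offset % 28


def is_hangul(ch: str) -> bool:
    return HANGUL_FIRST <= ord(ch) <= HANGUL_LAST


def in_exceptional_condition(word: str) -> bool:
    """True if any syllable ends in a voiced consonant and the next starts with a lenis one."""
    for left, right in zip(word, word[1:]):
        if not (is_hangul(left) and is_hangul(right)):
            continue
        final = decompose(left)[2]
        initial = decompose(right)[0]
        if final in VOICED_FINALS and initial in LENIS_INITIALS:
            return True
    return False


print(in_exceptional_condition("물고기"))  # True
print(in_exceptional_condition("불고기"))  # True
print(in_exceptional_condition("극비"))    # False: the final is an obstruent
```

Both example words satisfy the condition even though only one of them is pronounced with fortis, which is why the condition can only nominate candidates for manual review.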




When a vowel follows a consonant, two cases are observed.
In the first case, when the consonant is 'ㅅ [s]', the 'ㅅ [s]' is pronounced
either as 'ㄴ [n]' or as 'ㄷ [t]' in the same condition, for example,
'아랫니 [a-lEn-ni]' and '덧없이 [tvt-vp-si]'. In the second case, a letter 'ㄴ [n]' is
inserted between the consonant and the vowel. For example, '앞일 [aP-il]' is
pronounced as [암닐, am-nil].

In this invention, the conditions of the exceptional pronunciations are
arranged based on analytical research on the YKD.

Table 1 shows the conditions in which the exceptional
pronunciations are observed, and Table 2 shows examples for each condition.




[Table 2] Examples of exceptional pronunciations

Condition                 Pronunciations
ㄴ [n] + lenis consonant  [nun-byvN], [non-duk], [nun-Sal], [kwan-zvm], [nun-ga]
ㅇ [N] + lenis consonant  [dEN-…yvt], [caN-dok], [daN-Sok], [doN-zvk], [daN-gul]
ㄹ [l] + lenis consonant  [dIl-bo], [kal-dE], [kyvl-San], [kyvl-zE], [dIl-gvt]
C + V(i/y)                [am-nil]
ㅅ [s] + V                [ut-ot]
(incomplete entries)      [bvm-…], […sin-non]

(2) Compiling an exceptional pronunciation dictionary using a general
dictionary (YKD)

A reference dictionary 1 is compiled by extracting the words in the
exceptional conditions (using Table 1) from the entries of a general dictionary
which includes the basic words of the Korean language.

A researcher manually reviews the words of the reference dictionary 1 in the
exceptional conditions and compiles an exceptional pronunciation dictionary 1 by
collecting the words which show exceptional pronunciations.
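
The two operations of this step, automatic extraction followed by manual review, can be sketched as below. This is a toy illustration under stated assumptions: entries are given in the romanized notation of this description, the regular expression is a stand-in for the conditions of Table 1 that covers only the "voiced consonant followed by lenis consonant" case, the reviewer's decisions are modeled as a lookup table, and all names are hypothetical.

```python
import re
from typing import Callable, Optional

# Stand-in condition: a voiced consonant (m, n, N, l) directly followed by a lenis one.
CONDITION = re.compile(r"[mnNl][ptksc]")


def build_reference_dictionary(entries: list[str]) -> list[str]:
    """Keep only the entries that appear in the exceptional condition."""
    return [word for word in entries if CONDITION.search(word)]


def build_exception_dictionary(
    reference: list[str], review: Callable[[str], Optional[str]]
) -> dict[str, str]:
    """Keep the words for which the reviewer supplies an exceptional pronunciation."""
    result = {}
    for word in reference:
        pronunciation = review(word)
        if pronunciation is not None:
            result[word] = pronunciation
    return result


general_dictionary = ["mulkoki", "pulkoki", "namu", "kIkpi"]
reviewer_notes = {"mulkoki": "mulgoki"}  # hypothetical manual-review decisions

reference_1 = build_reference_dictionary(general_dictionary)
exceptional_1 = build_exception_dictionary(reference_1, reviewer_notes.get)

print(reference_1)    # ['mulkoki', 'pulkoki']
print(exceptional_1)  # {'mulkoki': 'mulgoki'}
```

Passing the review step in as a callable keeps the automatic filter separate from the human judgment that the description requires.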

(3) Compiling an exceptional pronunciation dictionary based on text corpus 
The text corpus is basically a collection of sentences, which are
analyzed, pre-processed, and divided into Eojols (units delimited by spaces). The
Eojols that appear in the exceptional conditions then form the vocabulary
dictionary 1 in the exceptional conditions.

Next, the vocabulary dictionary 1 in the exceptional conditions is
compared with the words included in the reference dictionary 1 in the exceptional
conditions generated in the previous step. As a result of the comparison, the
vocabulary dictionary 2 in the exceptional conditions is generated by
removing the words already present in the reference dictionary 1.

The exceptional pronunciation dictionary 2 is compiled by extracting
additional words having exceptional pronunciations through manual review of the
vocabulary dictionary 2 in the exceptional conditions.

The new reference dictionary 2 in the exceptional conditions is created by
merging the vocabulary dictionary 2 in the exceptional conditions and the reference
dictionary 1 in the exceptional conditions. Subsequently, when an exceptional
pronunciation dictionary is compiled from a new text corpus, the new reference
dictionary 2 in the exceptional conditions is used as the reference dictionary.
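
The corpus-based update can be expressed with ordinary set operations. The following is a minimal sketch under stated assumptions, not the disclosed implementation: the corpus is a toy list of romanized sentences, pre-processing is reduced to whitespace splitting and punctuation stripping, the regular expression is a stand-in for the exceptional conditions, and the names `eojols` and `update_from_corpus` are hypothetical.

```python
import re

# Stand-in condition: a voiced consonant (m, n, N, l) directly followed by a lenis one.
CONDITION = re.compile(r"[mnNl][ptksc]")


def eojols(corpus: list[str]) -> list[str]:
    """Split sentences on whitespace and strip surrounding punctuation."""
    tokens = []
    for sentence in corpus:
        for token in sentence.split():
            token = token.strip(".,!?\"'()")
            if token:
                tokens.append(token)
    return tokens


def update_from_corpus(
    corpus: list[str], reference_1: set[str]
) -> tuple[set[str], set[str]]:
    """Return (vocabulary dictionary 2, new reference dictionary 2)."""
    vocabulary_1 = {w for w in eojols(corpus) if CONDITION.search(w)}
    vocabulary_2 = vocabulary_1 - reference_1   # words not yet reviewed
    reference_2 = reference_1 | vocabulary_2    # reference for the next corpus
    return vocabulary_2, reference_2


reference_1 = {"mulkoki", "pulkoki"}
corpus = ["mulkoki nunpyvN namu.", "pulkoki nontuk"]  # hypothetical romanized text
vocabulary_2, reference_2 = update_from_corpus(corpus, reference_1)

print(sorted(vocabulary_2))  # ['nontuk', 'nunpyvN']
print(sorted(reference_2))   # ['mulkoki', 'nontuk', 'nunpyvN', 'pulkoki']
```

Only the words in `vocabulary_2` need manual review, so the reviewing effort grows with the number of new candidate words rather than with the size of the corpus.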

Thus, the method contributes to improving the performance of the automatic
Korean pronunciation generator, as well as the performance of Korean
speech recognition systems and Korean TTS systems.